JUZBOX: A web server for extracting biomedical words from the protein sequence

Authors

  • Paul Bobby
  • Seetharaman Balaji
  • Variath Sathyanath
  • Santhosh J Eapen
Abstract

The recognition of gene/protein names in the literature is one of the pivotal steps in processing biological literature for information extraction or data mining. We have compiled a lexicon of biomedical words (conserved patterns/potential motifs) that are spelled using only the 20 letters that denote amino acids. The remaining 6 letters of the English alphabet (B, J, O, U, X, Z) are treated as invalid amino acid characters in our context. For convenience, we jumbled these 6 letters into the term 'JUZBOX', and words containing any of these characters were filtered out of the biomedical lexicon. The generation of biomedical words from protein sequences using JUZBOX has applications specific to functional annotation. AVAILABILITY: JUZBOX is freely available at http://www.spices.res.in/juzbox.
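
The filtering idea in the abstract can be illustrated with a minimal Python sketch. This is not the JUZBOX server's implementation, which the abstract does not describe. The function names `filter_lexicon` and `find_words`, the `min_length` parameter, the sample word list, and the toy sequence are all assumptions made for illustration. The sketch keeps only words spelled with the 20 amino acid letters, then reports where those words occur in a protein sequence by plain substring search.

```python
# The 20 one-letter amino acid codes; the JUZBOX letters
# (B, J, O, U, X, Z) are deliberately left out.
AMINO_ACID_LETTERS = set("ACDEFGHIKLMNPQRSTVWY")


def filter_lexicon(words):
    """Keep only words spelled entirely with the 20 amino acid letters."""
    kept = []
    for word in words:
        upper = word.upper()
        if upper and set(upper) <= AMINO_ACID_LETTERS:
            kept.append(upper)
    return kept


def find_words(sequence, lexicon, min_length=3):
    """Return sorted (position, word) pairs for lexicon words in the sequence.

    min_length is a hypothetical cutoff to skip very short words.
    """
    sequence = sequence.upper()
    hits = []
    for word in lexicon:
        if len(word) < min_length:
            continue
        start = sequence.find(word)
        while start != -1:
            hits.append((start, word))
            start = sequence.find(word, start + 1)
    return sorted(hits)


# Illustrative word list and made-up sequence (not real data).
sample_words = ["protein", "kinase", "gene", "ligand", "helix", "zinc", "box"]
lexicon = filter_lexicon(sample_words)
hits = find_words("MKINASELIGANDGENE", lexicon)
print(lexicon)
print(hits)
```

In this toy run, "protein", "helix", "zinc", and "box" are dropped because they contain O, X, Z, or B, while "kinase", "gene", and "ligand" survive and are located in the sequence. A production tool would likely use a more efficient multi-pattern search than repeated `str.find` calls.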


Similar resources

In Silico Prediction and Docking of Tertiary Structure of Multifunctional Protein X of Hepatitis B Virus

Hepatitis B virus (HBV) infection is a universal health problem and may result in acute, fulminant, or chronic hepatitis, liver cirrhosis, or hepatocellular carcinoma. The sequence for protein X of HBV was retrieved from the UniProt database. ProtParam from the ExPASy server was used to investigate the physicochemical properties of the protein. Homology modeling was carried out using the Phyre2 server, and refin...


High Fuzzy Utility Based Frequent Patterns Mining Approach for Mobile Web Services Sequences

High fuzzy utility based pattern mining is an emerging topic in data mining. It refers to discovering all patterns whose utility meets a user-specified minimum high utility threshold. It comprises extracting patterns that are highly accessed in mobile web service sequences. Unlike the traditional fuzzy approach, high fuzzy utility mining considers not only counts of mob...


ESPript/ENDscript: extracting and rendering sequence and 3D information from atomic structures of proteins

The Fortran program ESPript was created in 1993 to display multiple sequence alignments adorned with secondary structure elements on a PostScript figure. A web server was made available in 1999, and ESPript has been linked to three major web tools: ProDom, which identifies protein domains; PredictProtein, which predicts secondary structure elements; and NPS@, which runs sequence alignment programs....


GeneInfoMiner - a web server for exploring biomedical literature using batch sequence ID

GeneInfoMiner is a web-based system for searching Medline abstracts using sequence ID lists such as GenBank accession numbers derived from high-throughput experiments. It will map query results to MeSH topics to facilitate the exploration of the biological significance of the sequence ID lists. GeneInfoMiner is based on a custom gene and protein name identification engine that can map gene and ...


Expression, Purification and Docking Studies on IMe-AGAP, the First Antitumor-analgesic Like Peptide from Iranian Scorpion Mesobuthus eupeus

Scorpion venom contains different toxins with multiple biological functions. IMe-AGAP is the first Analgesic-Antitumor like Peptide (AGAP) isolated from the Iranian scorpion Mesobuthus eupeus. This peptide is similar to the AGAP toxin, which has high analgesic activity, is extracted from a Chinese scorpion, and inhibits NaV1.8 and NaV1.9 voltage-gated sodium channels involved in the ...



Journal title:

Volume 4  Issue 

Pages  -

Publication date: 2009